
A Sharp Universality Dichotomy for the Free Energy of Spherical Spin Glasses

Kim, Taegyun

arXiv.org Machine Learning

We study the free energy for pure and mixed spherical $p$-spin models with i.i.d.\ disorder. In the mixed case, each $p$-interaction layer is assumed either to have regularly varying tails with exponent $α_p$ or to satisfy a finite $2p$-th moment condition. For the pure spherical $p$-spin model with regularly varying disorder of tail index $α$, we introduce a tail-adapted normalization that interpolates between the classical Gaussian scaling and the extreme-value scale, and we prove a sharp universality dichotomy for the quenched free energy. In the subcritical regime $α<2p$, the thermodynamics is driven by finitely many extremal couplings and the free energy converges to a non-degenerate random limit described by the NIM (non-intersecting monomial) model, depending only on extreme-order statistics. At the critical exponent $α=2p$, we obtain a random one-dimensional TAP-type variational formula capturing the coexistence of an extremal spike and a universal Gaussian bulk on spherical slices. In the supercritical regime $α>2p$ (more generally, under a finite $2p$-th moment assumption), the free energy is universal and agrees with the deterministic Crisanti--Sommers/Parisi value of the corresponding Gaussian model, as established in [Sawhney-Sellke'24]. We then extend the subcritical and critical results to mixed spherical models in which each $p$-layer is either heavy-tailed with $α_p\le 2p$ or has finite $2p$-th moment. In particular, we derive a TAP-type variational representation for the mixed model, yielding a unified universality classification of the quenched free energy across tail exponents and mixtures.
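For orientation, the classical Gaussian-normalized pure spherical $p$-spin model referenced above takes the following standard form (a sketch of the usual conventions; the paper's tail-adapted normalization modifies the $N^{-(p-1)/2}$ prefactor to match the disorder's tail index $\alpha$, in a form specified there):

```latex
% Hamiltonian of the pure spherical p-spin model with couplings J
H_N(\sigma) = \frac{1}{N^{(p-1)/2}} \sum_{i_1,\dots,i_p=1}^{N}
  J_{i_1 \dots i_p}\, \sigma_{i_1} \cdots \sigma_{i_p},
\qquad \sigma \in S^{N-1}\!\bigl(\sqrt{N}\bigr),

% Quenched free energy at inverse temperature \beta
F_N(\beta) = \frac{1}{N} \log \int_{S^{N-1}(\sqrt{N})}
  e^{\beta H_N(\sigma)}\, \mathrm{d}\sigma .
```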


Using latent representations to link disjoint longitudinal data for mixed-effects regression

Schächter, Clemens, Hackenberg, Maren, Pfaffenlehner, Michelle, Tambe-Ndonfack, Félix B., Schmidt, Thorsten, Pechmann, Astrid, Kirschner, Janbernd, Hasenauer, Jan, Binder, Harald

arXiv.org Machine Learning

Many rare diseases offer limited established treatment options, leading patients to switch therapies when new medications emerge. To analyze the impact of such treatment switches within the low-sample-size limitations of rare disease trials, it is important to use all available data sources. This, however, is complicated when the measurement instruments in use change during the observation period, for example when instruments are adapted to specific age ranges. The resulting disjoint longitudinal data trajectories complicate the application of traditional modeling approaches like mixed-effects regression. We tackle this by mapping observations of each instrument to an aligned low-dimensional temporal trajectory, enabling longitudinal modeling across instruments. Specifically, we employ a set of variational autoencoder architectures to embed item values into a shared latent space for each time point. Temporal disease dynamics and treatment switch effects are then captured through a mixed-effects regression model applied to latent representations. To enable statistical inference, we present a novel statistical testing approach that accounts for the joint parameter estimation of mixed-effects regression and variational autoencoders. The methodology is applied to quantify the impact of treatment switches for patients with spinal muscular atrophy. Here, our approach aligns motor performance items from different measurement instruments for mixed-effects regression and maps estimated effects back to the observed item level to quantify the treatment switch effect. Our approach allows for model selection as well as for assessing effects of treatment switching. The results highlight the potential of modeling in joint latent representations for addressing small data challenges.
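The two-stage idea (instrument-specific encoders into a shared latent space, then mixed-effects regression on the latent trajectories) can be sketched in a toy, noiseless simulation. Everything here is illustrative: linear maps and a pseudo-inverse stand in for the paper's variational autoencoders, and the mixed model is reduced to per-patient centering plus a pooled slope estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_patients, t = 20, np.arange(10.0)
slope = -0.3                                   # shared disease-progression slope
intercepts = rng.normal(0.0, 1.0, n_patients)  # patient-level random effects
Z = intercepts[:, None] + slope * t[None, :]   # latent trajectories (patients x time)

# Two instruments with different item loadings, each used on half the timeline
W1, W2 = rng.normal(size=(5, 1)), rng.normal(size=(8, 1))
items1 = Z[:, :5, None] * W1.T[None]           # instrument 1: first 5 visits
items2 = Z[:, 5:, None] * W2.T[None]           # instrument 2: last 5 visits

# "Encoder": project each instrument's items back to the shared 1-D latent.
# A pseudo-inverse stands in for the paper's variational autoencoders.
z1 = items1 @ np.linalg.pinv(W1.T).ravel()
z2 = items2 @ np.linalg.pinv(W2.T).ravel()
z_aligned = np.concatenate([z1, z2], axis=1)   # disjoint trajectories, now linked

# Mixed-effects regression on the latent: per-patient intercept, shared slope
centered_z = z_aligned - z_aligned.mean(axis=1, keepdims=True)
centered_t = t - t.mean()
slope_hat = (centered_z * centered_t).sum() / (n_patients * (centered_t ** 2).sum())
```

Because the simulation is noiseless, the pooled slope recovers the true progression rate exactly; with noise and nonlinear encoders one would fit the latent trajectories with an actual mixed-effects routine.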


Dynamic Intent Queries for Motion Transformer-based Trajectory Prediction

Demmler, Tobias, Hartung, Lennart, Tamke, Andreas, Dang, Thao, Hegai, Alexander, Haug, Karsten, Mikelsons, Lars

arXiv.org Artificial Intelligence

In autonomous driving, accurately predicting the movements of other traffic participants is crucial, as it significantly influences a vehicle's planning processes. Modern trajectory prediction models strive to interpret complex patterns and dependencies from agent and map data. The Motion Transformer (MTR) architecture and subsequent work define the most accurate methods in common benchmarks such as the Waymo Open Motion Benchmark. The MTR model employs pre-generated static intention points as initial goal points for trajectory prediction. However, the static nature of these points frequently leads to misalignment with map data in specific traffic scenarios, resulting in unfeasible or unrealistic goal points. Our adaptation of the MTR model, which replaces the static intention points with dynamically generated ones, was trained and evaluated on the Waymo Open Motion Dataset. Our findings demonstrate that incorporating dynamic intention points has a significant positive impact on trajectory prediction accuracy, especially for predictions over long time horizons. Furthermore, we analyze the impact on ground truth trajectories that are not compliant with the map data or constitute illegal maneuvers. Trajectory prediction is crucial for modern autonomous driving systems: it forms a deeper understanding of how other traffic participants will move in the future, which is the basis for the subsequent motion planning of the autonomous vehicle.
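The misalignment problem can be made concrete with a toy sketch: snapping off-map static anchors to the nearest lane-centerline points. This is only an illustration of why map-adapted goal points help; the actual model generates dynamic intent queries inside the transformer rather than snapping points post hoc.

```python
import numpy as np

# Toy straight lane centerline along the x-axis, sampled every 0.5 m
lane = np.stack([np.linspace(0.0, 50.0, 101), np.zeros(101)], axis=1)

# Static intention points from a pre-generated grid; all lie off the lane
static_pts = np.array([[10.0, 5.0], [25.0, -4.0], [40.0, 8.0]])

def snap_to_lane(points, centerline):
    # Replace each static anchor with its nearest on-lane point
    d = np.linalg.norm(points[:, None, :] - centerline[None, :, :], axis=2)
    return centerline[d.argmin(axis=1)]

dynamic_pts = snap_to_lane(static_pts, lane)
```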


Adaptive Variational Inference in Probabilistic Graphical Models: Beyond Bethe, Tree-Reweighted, and Convex Free Energies

Leisenberger, Harald, Pernkopf, Franz

arXiv.org Machine Learning

Variational inference in probabilistic graphical models aims to approximate fundamental quantities such as marginal distributions and the partition function. Popular approaches are the Bethe approximation, tree-reweighted, and other types of convex free energies. These approximations are efficient but can fail if the model is complex and highly interactive. In this work, we analyze two classes of approximations that include the above methods as special cases: first, if the model parameters are changed; and second, if the entropy approximation is changed. We discuss benefits and drawbacks of either approach, and deduce from this analysis how a free energy approximation should ideally be constructed. Based on our observations, we propose approximations that automatically adapt to a given model and demonstrate their effectiveness for a range of difficult problems.
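As a minimal concrete instance of the family of free-energy approximations discussed here, the sketch below compares the naive mean-field free energy (the simplest entropy approximation) with the exact log-partition function of a tiny pairwise model; the Bethe, tree-reweighted, and convex variants refine the entropy term in the same objective. All model parameters are made up for illustration.

```python
import itertools
import numpy as np

# Tiny pairwise model: p(x) ∝ exp(sum_i h_i x_i + sum_{i<j} J_ij x_i x_j), x_i ∈ {-1,+1}
n = 4
rng = np.random.default_rng(1)
h = rng.normal(0, 0.1, n)
J = np.triu(rng.normal(0, 0.2, (n, n)), 1)

def energy(x):
    return h @ x + x @ J @ x

# Exact log partition function by enumeration (feasible only for tiny n)
states = np.array(list(itertools.product([-1, 1], repeat=n)))
logZ = np.log(np.exp([energy(x) for x in states]).sum())

def mf_lower_bound(m):
    # ELBO under a product (mean-field) distribution with means m; always <= logZ
    m = np.clip(m, -0.999, 0.999)
    avg_energy = h @ m + m @ J @ m
    p = (1 + m) / 2
    entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p)).sum()
    return avg_energy + entropy

# Coordinate-ascent mean-field updates
m = np.zeros(n)
for _ in range(100):
    for i in range(n):
        m[i] = np.tanh(h[i] + (J[i] + J[:, i]) @ m)
```

For weakly coupled models the gap between `mf_lower_bound(m)` and `logZ` is small; on complex, highly interactive models it blows up, which is exactly the failure mode that motivates adapting the approximation to the model.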


Reviews: Statistical-Computational Tradeoff in Single Index Models

Neural Information Processing Systems

The paper first introduces the first- and second-order Stein's identities and then defines two function classes, $\mathcal{C}_1$ and $\mathcal{C}_2$, characterized by the covariance between $f$ and $X^\top\beta^*$. Further, the authors define a common function class $\mathcal{C}(\psi)$, which contains all link functions for which the second-order Stein's identity does not vanish under the transformation $\psi$. The authors then propose a mixed model in (2.6) using two link functions $f_1 \in \mathcal{C}_1 \cap \mathcal{C}(\psi)$ and $f_2 \in \mathcal{C}_2 \cap \mathcal{C}(\psi)$. This model is finally used to derive the lower bound. This is reasonable, since the true $\beta$ with link function $f_1$ is easy to estimate (using the first-order Stein's identity), while the true $\beta$ with $f_2$ is indistinguishable. The minimax rate is established in Prop. 3.1.
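As a quick illustration of the first-order mechanism the review refers to: for Gaussian $X$, Stein's identity gives $\mathbb{E}[y X] = \mathbb{E}[f'(X^\top\beta^*)]\,\beta^*$, so the sample mean of $yX$ recovers the direction of $\beta^*$. The cubic link and the design below are assumptions made purely for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50_000, 10
beta = np.zeros(d)
beta[0] = 1.0                                # true unit-norm direction

X = rng.normal(size=(n, d))                  # standard Gaussian design
y = (X @ beta) ** 3 + rng.normal(size=n)     # link f(t) = t^3, so E[f'] = 3 > 0

# First-order Stein estimate: the sample mean of y*X is proportional to beta
score = (y[:, None] * X).mean(axis=0)
beta_hat = score / np.linalg.norm(score)

cosine = abs(beta_hat @ beta)                # alignment with the true direction
```

For an even link such as $f(t)=t^2$ the factor $\mathbb{E}[f']$ vanishes, and one must fall back on the second-order identity, which is the source of the tradeoff the paper studies.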


Conditional Diffusion Models Based Conditional Independence Testing

Yang, Yanfeng, Li, Shuai, Zhang, Yingjie, Sun, Zhuoran, Shu, Hai, Chen, Ziqi, Zhang, Renming

arXiv.org Machine Learning

Conditional independence (CI) testing is a fundamental task in modern statistics and machine learning. The conditional randomization test (CRT) was recently introduced to test whether two random variables, $X$ and $Y$, are conditionally independent given a potentially high-dimensional set of random variables, $Z$. The CRT performs exceptionally well under the assumption that the conditional distribution $X|Z$ is known. However, since this distribution is typically unknown in practice, accurately approximating it becomes crucial. In this paper, we propose using conditional diffusion models (CDMs) to learn the distribution of $X|Z$. Theoretically and empirically, it is shown that CDMs closely approximate the true conditional distribution. Furthermore, CDMs offer a more accurate approximation of $X|Z$ compared to GANs, potentially leading to a CRT that performs better than those based on GANs. To accommodate complex dependency structures, we utilize a computationally efficient classifier-based conditional mutual information (CMI) estimator as our test statistic. The proposed testing procedure performs effectively without requiring assumptions about specific distribution forms or feature dependencies, and is capable of handling mixed-type conditioning sets that include both continuous and discrete variables. Theoretical analysis shows that our proposed test achieves a valid control of the type-I error. A series of experiments on synthetic data demonstrates that our new test effectively controls both type-I and type-II errors, even in high-dimensional scenarios.
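A stripped-down CRT loop looks as follows. Here the conditional law $X|Z$ is a known Gaussian linear model so the resampling step is exact; in the paper that law is unknown and is learned with a conditional diffusion model, and the statistic is a classifier-based CMI estimator rather than a plain correlation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 5
Z = rng.normal(size=(n, d))
gamma = 0.3 * rng.normal(size=d)

def sample_x_given_z(z):
    # The known conditional law X | Z; in the paper this is unknown and is
    # approximated with a conditional diffusion model instead
    return z @ gamma + rng.normal(size=len(z))

X = sample_x_given_z(Z)
Y = 2.0 * X + rng.normal(size=n)   # Y depends directly on X, so H0 is false

def stat(x, y):
    # Simple statistic for illustration; the paper uses a classifier-based
    # conditional mutual information estimator
    return abs(np.corrcoef(x, y)[0, 1])

# CRT: compare the observed statistic against copies where X is redrawn
# from X | Z, which breaks any X-Y dependence beyond what Z explains
t_obs = stat(X, Y)
t_null = [stat(sample_x_given_z(Z), Y) for _ in range(200)]
p_value = (1 + sum(t >= t_obs for t in t_null)) / (1 + len(t_null))
```

The `(1 + ...)/(1 + ...)` form makes the p-value exactly valid for any finite number of resamples when $X|Z$ is sampled correctly, which is why the accuracy of the learned conditional model is the crux of the method.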


Enabling Mixed Effects Neural Networks for Diverse, Clustered Data Using Monte Carlo Methods

Tschalzev, Andrej, Nitschke, Paul, Kirchdorfer, Lukas, Lüdtke, Stefan, Bartelt, Christian, Stuckenschmidt, Heiner

arXiv.org Machine Learning

Neural networks often assume independence among input data samples, disregarding correlations arising from inherent clustering patterns in real-world datasets (e.g., due to different sites or repeated measurements). Recently, mixed effects neural networks (MENNs) which separate cluster-specific 'random effects' from cluster-invariant 'fixed effects' have been proposed to improve generalization and interpretability for clustered data. However, existing methods only allow for approximate quantification of cluster effects and are limited to regression and binary targets with only one clustering feature. We present MC-GMENN, a novel approach employing Monte Carlo methods to train Generalized Mixed Effects Neural Networks. We empirically demonstrate that MC-GMENN outperforms existing mixed effects deep learning models in terms of generalization performance, time complexity, and quantification of inter-cluster variance. Additionally, MC-GMENN is applicable to a wide range of datasets, including multi-class classification tasks with multiple high-cardinality categorical features. For these datasets, we show that MC-GMENN outperforms conventional encoding and embedding methods, simultaneously offering a principled methodology for interpreting the effects of clustering patterns.
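The Monte Carlo idea at the heart of this approach can be sketched for a plain logistic random-intercept model: the intractable marginal likelihood of each cluster is approximated by averaging the conditional likelihood over draws of the random effect. The network is replaced by a single fixed-effect coefficient, and the random-effect scale is assumed known, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, beta = 1.5, 0.8                 # random-effect scale and fixed effect
n_clusters, m = 30, 40
b = rng.normal(0, sigma, n_clusters)   # cluster-specific random intercepts
x = rng.normal(size=(n_clusters, m))
y = rng.binomial(1, 1 / (1 + np.exp(-(beta * x + b[:, None]))))

def mc_marginal_loglik(beta_hat, n_samples=2000):
    # Monte Carlo integration over the random effect -- the core idea behind
    # MC-GMENN, shown here for a plain logistic model rather than a network
    b_draws = rng.normal(0, sigma, n_samples)
    logits = beta_hat * x[:, :, None] + b_draws[None, None, :]
    # Bernoulli log-likelihood summed within each cluster, per MC draw
    logp = (y[:, :, None] * logits - np.log1p(np.exp(logits))).sum(axis=1)
    # log-mean-exp across draws gives each cluster's marginal log-likelihood
    mx = logp.max(axis=1, keepdims=True)
    return (mx[:, 0] + np.log(np.exp(logp - mx).mean(axis=1))).sum()

ll_true, ll_wrong = mc_marginal_loglik(0.8), mc_marginal_loglik(-0.8)
```

In the full method the same Monte Carlo objective is differentiated with respect to neural-network parameters, which is what permits multi-class targets and multiple high-cardinality clustering features.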


On Predictive planning and counterfactual learning in active inference

Paul, Aswin, Isomura, Takuya, Razi, Adeel

arXiv.org Artificial Intelligence

Defining and thereby separating the intelligent "agent" from its embodied "environment", which then provides feedback to the agent, is crucial to model intelligent behaviour. Popular approaches, like reinforcement learning (RL), heavily employ such models containing agent-environment loops, which boils the problem down to agent(s) trying to maximise reward in the given uncertain environment Sutton and Barto [2018]. Active inference has emerged in neuroscience as a biologically plausible framework Friston [2010], which adopts a different approach to modelling intelligent behaviour compared to other contemporary methods like RL. In the active inference framework, an agent accumulates and maximises the model evidence during its lifetime to perceive, learn, and make decisions Da Costa et al. [2020], Sajid et al. [2021], Millidge et al. [2020]. However, maximising the model evidence becomes challenging when the agent encounters a highly 'entropic' observation (i.e. an unexpected observation) concerning the agent's generative (world) model Da Costa et al. [2020], Sajid et al. [2021], Millidge et al. [2020]. This seemingly intractable objective of maximising model evidence (or minimising the entropy of encountered observations) is achievable by minimising an upper bound on the entropy of observations, called variational free energy Da Costa et al. [2020], Sajid et al. [2021]. Given this general foundation, active inference Friston et al. [2017] offers excellent flexibility in defining the generative model structure for a given problem and has attracted much attention in various domains Kuchling et al. [2020], Deane et al. [2020]. In this work, we develop an efficient decision-making scheme based on active inference by combining 'planning' and 'learning from experience'.
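The bound mentioned here is easy to verify numerically in a discrete toy model: the variational free energy $F(q) = \mathbb{E}_q[\log q(s) - \log p(o,s)]$ satisfies $F(q) \ge -\log p(o)$, with equality exactly when $q$ is the true posterior. The model below is an arbitrary made-up example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states = 4
prior = rng.dirichlet(np.ones(n_states))        # p(s): prior over hidden states
lik = rng.dirichlet(np.ones(3), size=n_states)  # p(o|s): 3 possible observations
o = 1                                           # the observation actually received

joint = prior * lik[:, o]                       # p(o, s) as a vector over s
evidence = joint.sum()                          # p(o), the model evidence
posterior = joint / evidence                    # exact Bayesian posterior p(s|o)

def free_energy(q):
    # Variational free energy: E_q[log q(s) - log p(o, s)]
    return (q * (np.log(q) - np.log(joint))).sum()

F_post = free_energy(posterior)                  # equals -log p(o) exactly
F_other = free_energy(np.ones(n_states) / n_states)  # any other q scores worse
```

Minimising $F$ over $q$ thus both recovers the posterior (perception) and tightens the bound on $-\log p(o)$, which is the sense in which free-energy minimisation "maximises model evidence".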


Increasing Trust in Language Models through the Reuse of Verified Circuits

Quirke, Philip, Neo, Clement, Barez, Fazl

arXiv.org Artificial Intelligence

Language Models (LMs) are increasingly used for a wide range of prediction tasks, but their training can often neglect rare edge cases, reducing their reliability. Here, we define a stringent standard of trustworthiness whereby the task algorithm and circuit implementation must be verified, accounting for edge cases, with no known failure modes. We show that a transformer model can be trained to meet this standard if built using mathematically and logically specified frameworks. In this paper, we fully verify a model for n-digit integer addition. To exhibit the reusability of verified modules, we insert the trained integer addition model into an untrained model and train the combined model to perform both addition and subtraction. We find extensive reuse of the addition circuits for both tasks, easing verification of the more complex subtractor model. We discuss how inserting verified task modules into LMs can leverage model reuse to improve verifiability and trustworthiness of language models built using them. The reuse of verified circuits reduces the effort to verify more complex composite models which we believe to be a significant step towards safety of language models.
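The "no known failure modes" standard is exhaustive rather than statistical; for small $n$ it can literally be checked by enumeration. The sketch below verifies a digit-wise carry algorithm (the kind of algorithm the addition circuit is claimed to implement) over every pair of 2-digit inputs; it illustrates the verification standard, not the transformer itself.

```python
def add_digits(a, b, n):
    # Digit-wise addition with an explicit carry, mirroring the algorithmic
    # specification a verified addition circuit would be checked against
    carry, out = 0, []
    for i in range(n):
        s = (a // 10 ** i) % 10 + (b // 10 ** i) % 10 + carry
        out.append(s % 10)
        carry = s // 10
    out.append(carry)  # final carry digit covers the 99 + 99 edge case
    return sum(d * 10 ** i for i, d in enumerate(out))

# Exhaustive verification over every n-digit pair: no sampling, no edge
# case left unchecked
n = 2
ok = all(add_digits(a, b, n) == a + b for a in range(100) for b in range(100))
```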


Joint Learning of Network Topology and Opinion Dynamics Based on Bandit Algorithms

Xing, Yu, Sun, Xudong, Johansson, Karl H.

arXiv.org Artificial Intelligence

We study joint learning of network topology and mixed opinion dynamics, in which agents may have different update rules. Such a model captures the diversity of real individual interactions. We propose a learning algorithm based on multi-armed bandit algorithms to address the problem. The goal of the algorithm is to find each agent's update rule from several candidate rules and to learn the underlying network. At each iteration, the algorithm assumes that each agent follows one of the candidate update rules and then modifies the network estimates to reduce validation error. Numerical experiments show that the proposed algorithm improves initial estimates of the network and update rules, decreases prediction error, and performs better than other methods such as sparse linear regression and Gaussian process regression.
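The bandit component can be sketched as follows: each candidate update rule is an arm, and the negative one-step prediction error on the observed trajectory is the reward. The two toy rules and the epsilon-greedy scheme below are stand-ins; the paper's algorithm additionally updates the network estimate jointly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two candidate update rules for one agent's scalar opinion given the
# average neighbor opinion nb (toy stand-ins for the paper's rule set)
rules = [
    lambda x, nb: 0.5 * x + 0.5 * nb,   # DeGroot-style averaging
    lambda x, nb: x,                    # stubborn agent: never updates
]
true_rule = 0

# Observed trajectory generated by the true rule
T = 200
nb = rng.uniform(-1, 1, T)
traj, x = [], 0.0
for t in range(T):
    traj.append(x)
    x = rules[true_rule](x, nb[t])

# Epsilon-greedy bandit: pull an arm (candidate rule), reward it by how well
# it predicts the next observed opinion
counts, sums = np.zeros(2), np.zeros(2)
for t in range(1, T):
    if counts.min() == 0 or rng.random() < 0.1:
        arm = int(rng.integers(2))            # explore
    else:
        arm = int(np.argmax(sums / counts))   # exploit best rule so far
    err = (rules[arm](traj[t - 1], nb[t - 1]) - traj[t]) ** 2
    counts[arm] += 1
    sums[arm] -= err

best_rule = int(np.argmax(counts))            # most-pulled arm = inferred rule
```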